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Abstract The knowledge of receiver beam shapes is essential for accurate radio 
interferometric imaging. Traditionally, this information is obtained by holographic 
techniques or by numerical simulation. However, such methods are not feasible for 
an observation with time varying beams, such as the beams produced by a phased 
array radio interferometer. We propose the use of the observed data itself for the es- 
timation of the beam shapes. We use the directional gains obtained along multiple 
sources across the sky for the construction of a time varying beam model. The con- 
struction of this model is an ill posed non linear optimization problem. Therefore, 
fH | we propose to use Riemannian optimization, where we consider the constraints im- 

O [ posed as a manifold. We compare the performance of the proposed approach with 

traditional unconstrained optimization and give results to show the superiority of the 
proposed approach. 



1 Introduction 

Most interferometric observations are done using receivers that are more sensitive 
towards a part of the sky. This narrow field of view is attained using directive anten- 
nas (such as a dish) or by beamforming. Due to this reason, images made by such 
interferometric observations are distorted, with the distortion increasing for celestial 
objects further away from the direction where the beams are pointed at. Therefore, the 
knowledge of the beam shape is essential to correct for this distortion while producing 
accurate and distortion free images. Traditionally beam information is obtained by 



holographic techniques (Scott and Rvle 



1976t Bennet etal 1976 : Popping and BraunI 



2008 ) or by drift scanning dPober eta" 201 lh . These methods work well for a sta- 



tionary and stable beam pattern, such as the beams produced by movable dish based 
receivers. However, such techniques will not give accurate results for interferometers 
that have time varying beam shapes. A case in point is the beam shapes produced 
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by phased array radio telescopes such as lofabQ. During an observation with a 
phased array, the beamforming weights change and this results in a variation of the 
overall beam pattern. In addition, different element layouts between different stations 
also make the beam shape significantly different. Moreover, secondary effects such 
as mutual coupling would make the beam shapes different for each receiver. 

In this paper, we propose to use the observation itself to extract beam shape infor- 
mation rather than using a priori information such as by holography or drift scanning. 
Efficient te chniques are available to extract directiona l gains along multiple directions 
in the sky (I Yatawatta et alll2009t iKazemi et alll201 lbl) . We can extract these gains not 
only along the direction where the beams are pointed at, but also along other direc- 
tions where there are well known celestial sources to sufficiently sample the beam 
shape. Once these directional gains are available, the recovery of the beam shape is 
an ill posed nonlinear optimization problem. Therefore, we have to apply additional 
constraints to get a satisfactory soluti on. This natur a lly leads us to o ptimization on a 



Riemannian manifold, as discussed in lGabavl (1 1 9821) : iMantonl (120041) . As a byproduct 



of this process, we can also obtain intrinsic fluxes of the celestial sources used in 
calibration, subject to provision of a few known sources for absolute flux calibration. 

Manifold optimization has been applied in diverse areas of research and a com- 
plete overview is given in Absil et a"! (2008). In this paper, we present a hybrid op- 



timization method that jointly uses steepest descent (SD) and the Broyden Fletcher 
Goldfarb Shanno (BFGS) algo rithms on a Riemannian manifold. We use the geodesic 



stepping method (lFiorill201 II) based on a Riemannian gradient as our Riemannian 
steepest descent (RSD) method. However, SD method has the drawback of only hav- 
ing linear convergence rate, especially close to the sol ution. To accelerate the conver- 
gence, we use the Riemannian BFGS (lOi et a 1 2010) algorithm in conjunction with 
the RSD method. 



Together with calibration along multiple directions (lYatawatta et a ll2009t l IKazemi et all 



201 lbl) . the methods proposed in this paper to estimate the beam shape and intrinsic 
fluxes can also be considered as one cycle of self-calibration, where we also update 
the sky model. However, in this paper we focus our attention on the latter part of 
the self-calibration cycle, i.e., the estimation of beam shapes and the estimation of 
intrinsic fluxes. 

The rest of the paper is organized as follows: In section [2] we give an overview 
of radio interferometry. Next in section [3] we present the beam shape estimation 
approach and we elaborate on Riemannian optimization in section |4] In section [5] 
we give simulation results to verify the proposed approach and give conclusions in 
section [6] 

Notation: Matrices and vectors are denoted by bold upper and lower case letters 
as B and v, respectively. The canonical vector with a one at the p-th location and zeros 
everywhere else is given by e p . The transpose, Hermitian transpose and conjugation 
are given by (.) T , (.) H and (.)*, respectively. The matrix Kronecker product is denoted 
by ® and the Frobenius norm is given by ||.||. The set of complex numbers is denoted 
by C. 
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2 Radio Interferometry 



We give a brief overview of radio interferometry and calibration in this section. Con- 
sider an interferometer formed by station p and station q. The received data after 
correlation and correction for delay errors between stations can be given as 



M 



C 1 H + N 



(1) 



where \ pq (e C 2x2 ) is the visibility matrix dHamaker etalll 1996b and N pq (e C 2x2 ) is 



the noise. In (Q]i, the observation consists of radiation from M discrete sources in the 
sky, whose coherencies are given by C pqm (e C 2x2 ). For a point source with intensity 
l m and polarized flux Q m , U m , V m , the coherency for linearly polarized receptors is 
given by 

An + Qm U m + jVm 
Um jVm hit Qm 



' pqm 



(2) 



where <p pqm is the Fourier phase component that depends on the direction in the sky as 
well as the separation of station p and q. In calibration, we estimate the Jones matrices 
J pm (6 C 2x2 ) for each stati on as well as for each direction in the sky dYatawatta et all 



2009t iKazemi et alll201 lbl) . For each source, we have accurate knowledge of the di- 
rections (or positions) in the sky and only an apparent knowledge of the fluxes. The 
solutions obtained for J pm contai n the info r mation about the beam shape along each 



direction. However, as noted in lHamakerl d2000b 



there is always an ambiguity in 
these solutions and therefore, we cannot use the values of J pm to directly construct a 
beam model. Most celestial sources are unpolarized and thus, there is a unitary ambi- 
guity in the solutions and what we obtain is J pm V m where U m (e C 2x2 ) is an unknown 
unitary matrix. 



3 Beam Shape Estimation 

The beam shape of a phased array receiver consists of two parts. Each element used 
in beamforming has the element beam pattern that is sensitive to the full sky. Using 
beamforming, this beam is narrowed down to cover the field of interest in the sky. 
Therefore, for the p-th station, the beam gain along the m-th direction can be given 
as ypm^jp,,, where the element beam is given by E pm (e C 2x2 ). What we are interested 
in is the array gain y pm (e C), which is dependent on the beamforming weights and is 
changing as the weights change. This is 1 at the direction where the beam is pointed 
at and as the sky rotates, due to the continuous tracking of a single direction in the 
sky, the overall values for y pm change with time. 

Note that y pm is a complex valued parameter: Therefore, in general we estimate 
the voltage beam shape. Moreover, any atmospheric phase variations are also incorpo- 
rated into the value of y pm . Before we proceed, we make the following assumptions: 

- We have satisfactory knowledge of the element beam pattern E pm , mainly by 
numerical simulation. 
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- Although the calibration of ([TJ was performed using apparent fluxes of the M 
sources, we assume approximate knowledge of intrinsic fluxes of at least a few 
sources (we call them as seed sources). 

- We assume perfect knowledge of source positions and assume atmospheric phase 
errors are absorbed into the solutions of J pm , in addition to beam shape errors. 

We select a set of complex valued basis functions to model the beam shape. Let 
the number of basis functions be D and then we can evaluate the beam gain as 

y pm = e^Bb„, (3) 

where B (e C NxD ) gives the beam model for all N stations. The values of the basis 
functions along the m-th direction is given by b„, (e C Dxl ). The canonical vector is 
given as 

e p = [0,0,. ..,0,1,0,. ..,0f (4) 

with all zeros except a 1 at the p-th location. Ideally, for a source m with perfect 
knowledge of its fluxes we get 

C pqm y p „,y ' qm — JpmC pqm J qm (5) 

where C pqm (e C 2x2 ) is the true coherency, taking into account the element beam 
shapes E ; „„ and E ?m . Note that the ambiguities in J pm and ] qm cancel out and does 
not affect Q. 

Using (0 and (0, the cost function that needs to be minimized to estimate B can 
be given as 

/(B) — 11^ P<Pnf 'P"'7 'qm i pmC pqmi q,„ Ir (6) 

p,q,m 

where the summation is taken over all baselines p,q and all sources m whose intrinsic 
fluxes are known. The minimization of (O to yield an estimate for B is highly ill posed 
mainly due to not having enough sources with known fluxes as well as sufficient 
intensities to yield a good solution for J pm under noisy observations. Therefore, we 
impose the additional constraint that preserves the total power received by all stations. 
As shown in appendix|A] the total power constraint can be represented as 

trace(B H B) = a (7) 



where a is a fixed real value. Although we cannot exactly determine a, a nominal 
value based on the chosen basis functions and a nominal beam shape is sufficient. 

Apart from the cost function ([6j, we nee d the gradient of the cost functi on in our 
optimization routines. Using techniques of (Hi orungnes and Gesbertl 120071) . we get 
the derivative as (proof is given in appendix iBt 



— ^ ^ filpqml \pqm ^Ipqm^lpqm ^pqm^'ipqm 



(8) 
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where 

Plpqm = traCC {Cp^Cpqn^j , (9) 

Plpqm = trace {C IKjm i p m C pqmJ qyn) i 

fi'Spqm — trace ((J p m C pq m J q /n ) Cpq m ^j , 

r lp q m = e 9 b^(b«B ff e p )(e^Bb m )(b^B H e ? ) 

+ e p b„>^B w e/) )(e^Bb m )(b«B%), 

r 2pqm = t q b T m (b%B H t p ), 
r 3pqm = e p b T m (b^B H e q ). 

To summarize: we need to estimate B by minimizing the cost function (O subject 
to the constraint (0. As described in next section, we choose Riemannian optimiza- 
tion to solve this problem. 



4 Riemannian Optimization 



We give a brief description of the motivation behind using Riemanni an optimizatio n 
as opposed to tra ditional constrained optimization. As presented in iGabavl d 1982b : 
Absil et all d2008l) and other work, traditional constrained optimization (using La- 
grange multipliers) increases the dimensionality of the problem, therefore making it 
more complicated. On the other hand, the constraints ((O in our case) can be thought 
of as restricting B onto a Riemannian manifold. Therefore the dimensionality is not 
increased. However, the traditional gradient based optimization algorithms applica- 
ble in Euclidean space cannot be applied in a straight forward manner because the 
tangent planes change depending on the value of B. 

We present two optimization algorithms on th e man ifold (j7]i that we use in beam 
estimation. The first one is RSD, as pr esented in ( Fioril 2oTTh and the second one is 
RBFGS, as presented in (lOi et a 3 l2010h . The steepest descent method has linear con- 
vergence but simpler to implement while the RBFGS has super-linear convergence. 
Apart from the cost function the only requirement is the gradient {§). By using 
both algorithms in a hybrid fashion, we have faster convergence and are less suscep- 
tible to get stuck in a local minimum. 



4. 1 Riemannian Steepest Descent Algorithm 



We choose the method proposed in iFioril (120111) as our RSD algorithm. We b riefly 
give th e algorithm that we use for estimating B, more detail can be found in Fioril 
d201lh . 

1. Calculate |^ using © and ©. 
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2. Calculate the Riemannian gradient 

- V B / = — - -Breal ( trace(B H — ) | . (10) 
J dB a \ dW) 

3. Find the step size h in [0, In/u)}, a> = ||Vb/||/ V"? tnat minimizes 

f(B(h)) = /(Bcos(w/ 2 ) + (V B f) S m(coh)/co). (11) 

4. Update B <— B cos(coh) + (V B /) sm(a)h)/a>. 

5. If ||Vb/|| is too small or the maximum number of iterations has reached stop, else 
go back to step 1 . 



4.2 Riemannian Broyden Fletcher Goldfarb Shanno Algorithm 

In order to present the RBFGS algorithm, we use an alternative representation for the 
beam model B as follows 

x = [vec(real(B)) T vec(itnag(B)) T ] T / yfa (12) 

where x is a real vector of size 2ND x 1 . Then, the constraint (Q) can be rewritten as 

x r x=l (13) 

which makes x restricted to a (real) Stiefel manifold of size 2ND x 1 and dimension 
2ND - 1 . This also means that x is on a 2ND - 1 dimensional unit sphere (which is 
a special case o f a Stiefel man ifold). We adopt the BFGS algorithm on a unit sphere 
as presented in lOi et a 1 (2010). In order to fully implement this algorithm, we need 
to define several operators on the manifold. The projection of any vector i] to tangent 
space at x on the manifold is given by 

P x (i7) = (I-xx r )7/. (14) 

The cost function (0 can be expressed as /(x) = /(B) with some abuse of no- 
tation. The gradient is constructed from ([8]) by projecting it onto the tangent space 
as 

grad(f(x)) = (I - xx T )[ V ec(real&) T vec(ima 8 (^)) T ] T / Va. (15) 

OB oB 

The retraction of vector rj in the tangent space at x to the manifold is given by 

*M = (16) 
ll(x + iflll 

In Oi et al ( 2010l) . vector transport is used to transport a tangent vector from a 
tangent space at one point to the tangent space at another point on the manifold. This 
operator is given by 

T *^4- (x LT^ )r k (i7) 

\ Il(x + i7)|| 2 j 
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and its inverse (inverse vector transport) is given by 




(18) 



With these definitions at hand, we are ready to implement the RBFGS algorithm. 

- Initial conditions: Hessian approximation Hi = I. 

- Iterations k = 1 to max iterations 

1. Obtain j] k by solving = -gradf(x k ). 

2. Perform line search: set a = 1; c = gradf(x k ) 7 rj k 

- while f(R Xk (2ai] k )) - f(x k ) < ac, update a <— 2a. 

- while /(RxjXflj/jr.)) - f(x k ) > 0.5ac, update a <— 0.5a. 

3. Update x A . + i R Xk (ai] k ). 

4. s k = T Xl! (aJ] k ,aJ] k y, y k = gradf(x k+ i) - T Xli (ai] k , gradf(x k )) 

5. Update Hessian approximation as H k = T(x k+ \, aj] k )ll k T~ l (x k+ i, ar) k ) 



4.3 Hybrid Optimization 

With the RSD and RBFGS algorithms as implemented above, the implementation of 
the hybrid algorithm is as follows. 

1. Start with nominal beam shape Bo and a — frace(B ? Bo). 

2. In parallel, run RSD and RBFGS with maximum number of iterations fixed to n\ 
(about 10). 

3. Compare the final cost from both RSD and RBFGS algorithms. Select the solution 
with the lowest cost from either RSD or RBFGS as the updated value for B. 

4. If maximum number of hybrid iterations ri2 (about 200) is reached, stop. Else go 
back to step 2 with the updated B as the initial value. 

Note that in this algorithm, we use two limits for the number of iterations, the first 
one for each RSD and RBFGS iteration limit («i) and the second one for the hybrid 
iteration limit {ni). It should also be mentioned that the solution obtained for B always 
has an unknown complex scalar ambiguity. This can be eliminated by normalizing the 
peak of all the estimated beams to a pure real value. 

The initial selection of a is done by assuming a nominal beam model. Depending 
on additional information such as the beamformed element layout and the frequency 
of observation, and also depending on the basis functions chosen, it is possible to 
determine an accurate value for a. We also use the nominal beam model as our initial 
value in optimization. 

4.4 Flux Estimation 



and H i+ i = 




Once we have the estimate for B, it is also possible to estimate the intrinsic fluxes for 
all sources in our calibration model ([TJ. In order to do this, we make an additional 
assumption: 
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- All stations p see the same intrinsic sky, therefore, for a sky consisting of point 
sources C pqm - C m and the common Fourier phase term in (f2| and (0 can be 
precomputed. For an array with parallel dipoles (such as LOFAR), we assume the 
element beam pattern of each station is identical. Therefore, the dependence of p 
and q on C pqm is eliminated. 

Under this assumption, for the m-th source we define the cost to be minimized in 
order to estimate the flux as 

gm(C m ) = 2^ H C mypmy c j m Jpm^pqmJqmW ■ (19) 

PA 

By making d^ C|,l) - 0, we get the estimate 

~ _ Tip,ql 'pm) 'qmi pmC pqmJqm 
Yjp,q Y/ pm\^\y qm\" 

It should be reminded that we can use ( 120b to estimate fluxes for any point source 
along the direction of which we have obtained a calibration solution. Of course, for 
sources that are far away from the center of the beam, the denominator of d20b would 
get close to zero, making our flux estimate unreliable. This can be overcome by com- 
bining observations taken at different epochs. Once we have updated the sky model 
using d20b . we can go back to update our estimate of B. Therefore, with an updated 
sky model, we can use more seed sources to better constrain the estimation of the 
beam shape. In addition, this step also completes one self-calibration loop. 



5 Simulation Results 

We consider an observation with a field of view of 8 degrees (diameter) in the sky. 
We simulate M — 50 sources, randomly placed in the field of view with no intrinsic 
polarization Q,„ = U m = V m = and intensities I m varying from 1 to 20 flux units. 
The positions of the sources are shown in Fig. Q] while the circles sizes indicate the 
flux ratio between the apparent and true flux values. The number M — 50 was chosen 
to emulate a typical situation with a LOFAR observation at about 150 MHz with an 
average beam diameter of about 8 degrees. At much higher frequ encies, the beams are 
narrower and the sources are less bright, therefore, 'clustering' dKazemi et alluOl lab 
of sources may be required to get sufficient directions along which to calibrate. 

We simulate an interferometer with N - 6 stations. The beam shape of each 
station is generated to be a Gaussian with random major and minor axes and random 
offsets from the center of the field of view, as shown in Fig. [2] In addition, we multiply 
this with a random linear phase screen to make the beam complex. In order to generate 
the apparent sky model, we attenuate the intrinsic fluxes of the sky model with the 
mean of the amplitude of the beam shapes shown in Fig. [2] as this is what the fluxes 
that will be seen in an image made by this interferometer. The mean beam shape is 
shown in Fig. [3] We also corrupt the apparent fluxes with Gaussian noise, having zero 
mean and a variance of 0.01. 
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Fig. 1 Sky model in a field of view of 8 degrees in diameter. The circles correspond to the ratio between 
the apparent flux used in calibration and the intrinsic flux of each source. 



Once we have generated the apparent sky model, we calculate the gain along each 
direction using the true beam shape and the apparent flux. As an example, we give 
the gain variation along an azimuthal track for a direction 4 degrees away (in zenith 
angle) from the field center in Fig. |4] We only show the (1,1) entry of the matrix 
J pm in Fig. |4] For each direction m, we calculate the true Jones matrix J pm and use 
a randomly generated unitary matrix U m to get the values used in (0 as J ; „„U„, + N, 
where N is a complex Gaussian noise matrix, with elements having zero mean and a 
variance of 0.01. For a real observation, this step is replaced by calibration along the 
direction of each source in the sky model. 

We consider minimizing (0 by the proposed method as well as by the uncon- 

strain ed Broyden Fletcher Goldfarb Shanno (BFGS) optimization routine ( INocedal and Wright 
19991) . For both routines, we need to supply an initial value for B. We consider the 



initial beam for all stations to be a circular Gaussian with major and minor axes di- 
ameter of 4 degrees (half the field of view). We selected spherical harmonics with 
order 4 as the basis functions for B. Therefore, there are D - 16 basis functions and 
the size of B is 6 x 16. The initial value of B was used to calculate the value for 
a = trace{B H R). Spherical polar coordinates, centered at the pole of Fig. Q] are used 
to calculate the basis functions. 

The reduction of the cost function with the number of hybrid iterations of the 
proposed algorithm is shown in Fig. [5] At certain points of the iteration, RBFGS 
algorithm finds successful solutions and the cost is reduced at a rate which is super- 
linear. 

In Fig. [6] we have given the results of the unconstrained optimization with 2500 
iterations. The results of the proposed algorithm is shown in Fig. [71 after 250 hybrid 
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Fig. 2 Original beams: (a) Real value (b) Imaginary value. Each beam is a Gaussian in amplitude with a 
randomly chosen major and minor axis and a random offset from the center. The amplitude is multiplied 
by a random linear phase screen to make the beam complex. 



iterations. The inner iterations used is 10 so the total number of iterations for the 
proposed method is 2500 as well. 

Comparison of Figs.|6]and|7]with the original in Fig. [2] clearly shows the superi- 
ority of the proposed method. The real values of the beams in Figs.|6]and|7]indicate 
that the proposed method gives a more focused beam shape as opposed to the uncon- 
strained approach. Moreover, the proposed method recovers the imaginary value of 
the beams better than the unconstrained approach. Note that the beam number 4 in 
Fig. [2] has almost zero imaginary value (implying that the phase component is neg- 
ligible). While the proposed approach also gives a very small value for this beam in 
Fig. El me unconstrained approach gives a significantly higher value, as seen in Fig. 
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Fig. 3 Beam shape used for calculating the apparent sky fluxes. We start with a real beam so the imaginary 
value is zero. 

[6] In Fig. [8] we also show the error amplitude between estimated beams and the orig- 
inal beams. For a quantitative comparison, we have calculated the total squared error 
between the original beams and estimated beams for the full field of view, sampled 
at 30 x 30 grid points. For the unconstrained case, we get a total error of 152 while 
for the proposed case we only get an error of 130. Comparison of the final cost of 
/(B) in © at the end of each algorithm shows a different result. With conventional 
optimization, we get a much lower cost for (O compared with the proposed method. 
This is clearly a misleading result due to the ill-posedness of the problem. 

In Figs. [9] and [10] we have shown the error in estimating the intrinsic flux using 
(1201 . Both figures show the difference of the estimated flux with the true flux by 
the size of the circles. In Fig. [9] we have used the beam estimate obtained using 
the unconstrained approach while in Fig. [I0]we have used the beam shape obtained 
using the proposed approach. Both beam estimates give good recovery of the true 
fluxes within the inner region of the field of view. 

The sources at the outlier clearly shows an error in the recovered flux, mainly be- 
cause their apparent flux is low and are more susceptible to noise. Therefore, in order 
to improve the flux estimation of outlier sources, we can use diversity in frequency 
and time. Because the sky is almost invariant, we can combine beam estimates ob- 
tained over different time and frequency intervals to improve the flux estimates of 
outlier sources. 



6 Conclusions 

We have proposed a method of estimating interferometer beam shapes used in a radio 
interferometric observation by using the directional gains obtained towards known 
celestial sources. This ill posed problem is solved using optimization on a Rieman- 
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nian manifold. As compared with conventional optimization, the proposed method 
give better results. However, the proposed method is computationally more expen- 
sive than conventional (unconstrained) optimization. Future work will address the 
application of this method to real interferometric observations and reducing the com- 
putational cost. 
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Fig. 5 Reduction of the cost function with the number of iterations for the proposed algorithm, (a) linear- 
log scale (b) log-log scale. The sudden jumps of the cost occur when RBFGS finds a successful solution. 
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Fig. 6 Estimated beams using unconstrained BFGS optimization: (a) Real value (b) Imaginary value. The 
total squared error between the original beams and the estimated ones is about 152. 
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Fig. 9 The error (difference between estimated flux and true flux) in estimated fluxes using the beam 
shape obtained by unconstrained optimization. The size of the circles represent the magnitude of the 
absolute error. 
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Fig. 10 The error (difference between estimated flux and true flux) in estimated fluxes using the beam 
shape obtained by the proposed approach. The size of the circles represent the magnitude of the absolute 
error. 
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A Proof of ©: 

Let the total power received by all stations be r. We can express this as 

r = J]\y pll ,\ 2 = ^|epBb,„| 2 (21) 

pjn p.m 

where p is summation over all stations (1, ... , N) and m is summation over an infinite number of directions 
in the sky that covers the full field of view. Note that the summation over m is not restricted to the directions 
where we have known sources. We have 



r = trace 



and using the fact that 



we get 



Let 



Then, 



r = trace 



in p ) 

J^b^B"!*). (24) 

III I 

rr H = YjQ>mb%). (25) 



r = trace ((Br)"(BT)) = ||BT|| 2 < ||B|| 2 ||r|| 2 (26) 

Taking into account that ||T|| 2 is fixed for a given basis, we can keep T below a certain level by keeping 

||B|| 2 = trace (B H h) = a (27) 

where a is a fixed real value. One additional point to be raised here is that by selecting an orthonormal 
basis, we get T « I, therefore an orthonormal basis is always preferred (although in practice hard to 
realize). 



B Proof of©: 



We can rewrite (6) as 

/(B) = ^ trace ((X„„, ® Z w ,„ - Y pqm ® 

1) O^pqm ® Zpqm ~ ^ pqm 8> 1)) (28) 

p,q,m 

where X pqm = C pqm , Z pqm = elBb,„b"ft H e q , Y pqm = 3 pm C pqm J qm and 1 = 1. This can be simplified as 
/(B) = trace(Xp qm X pqm )trace(Z pqm Z pqm ) (29) 

p.qjn 

- trace(X" qm Y pqm )trace(Z" qm ) 

- trace(Y" qm X pqm )trace(Z pqm ) 
+trace(Y pqm Y pqm ). 

Using iHiorungnes and Gesberj j2007l) . we can take the derivative of each term with a trace of Z^ ( y/„ to 
yield {§}. 



